Working with JSON documents

JSON documents are very popular, especially when it comes to API responces and/or financial data. They provide nice, dictionary-like interface to data with the opportunity of working with keys rather than indecies only. Thus, Python has a built-in support for JSON documents with necessary ready-made functions. To access those functions, one needs to import the JSON library, which comes directly installed with Python.


In [3]:
import json

Let's create a sample JSON fie and save it to some variable called input.


In [4]:
input = '''[
  { "id" : "01",
    "status" : "Instructor",
    "name" : "Hrant"
  } ,
  { "id" : "02",
    "status" : "Student",
    "name" : "Jimmy"
  }
]'''

As you can see here, our JSON documents is nothing else than a list of two dictionaires with 3 keys each (and a value for each key). To parse it as a usual Python object (list in this case), the loads() function from the json package is used.


In [5]:
# parse/load string
data = json.loads(input)
# data is a usual list

In [6]:
type(data)


Out[6]:
list

In [7]:
print(data)


[{u'status': u'Instructor', u'id': u'01', u'name': u'Hrant'}, {u'status': u'Student', u'id': u'02', u'name': u'Jimmy'}]

In [8]:
from pprint import pprint

In [9]:
pprint(data)


[{u'id': u'01', u'name': u'Hrant', u'status': u'Instructor'},
 {u'id': u'02', u'name': u'Jimmy', u'status': u'Student'}]

In [10]:
print 'User count:', len(data), "\n"


User count: 2 


In [11]:
data[0]['name']


Out[11]:
u'Hrant'

In [12]:
for element in data:
    print 'Name: ', element['name']
    print 'Id: ', element['id']
    print 'Status: ', element['status'], "\n"


Name:  Hrant
Id:  01
Status:  Instructor 

Name:  Jimmy
Id:  02
Status:  Student 

Reading JSON from a file

using Pandas


In [13]:
import pandas as pd

In [14]:
address = "C:\Data_scraping\JSON\sample_data.json"

In [15]:
my_json_data = pd.read_json(address)

In [16]:
my_json_data.head()


Out[16]:
id name status
0 1 Hrant Instructor
1 2 Jack Student

using with open()


In [17]:
import json

In [18]:
with open(address,"r") as file:
    local_json = json.load(file)

In [19]:
print(local_json)


[{u'status': u'Instructor', u'id': u'01', u'name': u'Hrant'}, {u'status': u'Student', u'id': u'02', u'name': u'Jack'}]

In [20]:
type(local_json)


Out[20]:
list

In [21]:
pprint(local_json)


[{u'id': u'01', u'name': u'Hrant', u'status': u'Instructor'},
 {u'id': u'02', u'name': u'Jack', u'status': u'Student'}]

Writing JSON files


In [22]:
with open('our_json_w.json', 'w') as output:
    json.dump(local_json, output)

Yet, as you may have already noticed, the saved JSON files not human-readible. To make them more user friendly, we may sort the Keys and provide 4-tab indentation.


In [23]:
with open('our_json_w.json', 'w') as output:
    json.dump(local_json, output, sort_keys = True, indent = 4)

Converting JSON to CSV


In [25]:
import csv, json
address = "C:\Data_scraping\JSON\sample_data.json"
with open(address,"r") as file:
    local_json = json.load(file)

with open("from_json.csv", "w") as f:
    writer = csv.writer(f)
    writer.writerow(["ID","Name","Status"])
    for item in local_json:
        writer.writerow([item['id'],item['name'],item['status']])